Global Nash convergence of Foster and Young's regret testing

Authors

  • Fabrizio Germano
  • Gábor Lugosi
Abstract

We construct an uncoupled randomized strategy of repeated play such that, if every player plays according to it, mixed action profiles converge almost surely to a Nash equilibrium of the stage game. The strategy requires very little in terms of information about the game, as players’ actions are based only on their own past payoffs. Moreover, in a variant of the procedure, players need not know that there are other players in the game and that payoffs are determined through other players’ actions. The procedure works for finite generic games and is based on appropriate modifications of a simple stochastic learning rule introduced by Foster and Young [12].
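As a rough illustration of what "based only on their own past payoffs" can mean, the following sketches one period of a regret test in the spirit of Foster and Young's rule, from a single player's point of view. It is not the modified procedure constructed in the paper. The function names, the `payoff_fn` callback, and the parameters `period_len`, `explore_prob`, and `tolerance` are all assumptions made for this sketch.

```python
import random


def random_mixed_strategy(n_actions, rng):
    """Draw a mixed strategy uniformly from the probability simplex."""
    cuts = sorted(rng.random() for _ in range(n_actions - 1))
    points = [0.0] + cuts + [1.0]
    return [points[i + 1] - points[i] for i in range(n_actions)]


def regret_test_period(strategy, payoff_fn, period_len, explore_prob, tolerance, rng):
    """Play one period and return the mixed strategy to use in the next one.

    payoff_fn(action) is a hypothetical callback returning the realized payoff.
    The player observes nothing else: no opponents' actions, no opponents' payoffs.
    """
    n = len(strategy)
    base_total, base_count = 0.0, 0          # rounds played with the current strategy
    exp_total, exp_count = [0.0] * n, [0] * n  # experimental rounds, per action

    for _ in range(period_len):
        if rng.random() < explore_prob:
            # Experimental round: try an action chosen uniformly at random.
            action = rng.randrange(n)
            exp_total[action] += payoff_fn(action)
            exp_count[action] += 1
        else:
            # Regular round: sample from the current mixed strategy.
            action = rng.choices(range(n), weights=strategy)[0]
            base_total += payoff_fn(action)
            base_count += 1

    base_avg = base_total / base_count if base_count else 0.0
    # Estimated regret of each tried action relative to the current strategy.
    regrets = [exp_total[a] / exp_count[a] - base_avg for a in range(n) if exp_count[a]]

    if regrets and max(regrets) > tolerance:
        # Test failed: discard the strategy and draw a fresh one at random.
        return random_mixed_strategy(n, rng)
    return strategy  # Test passed: keep the current strategy.
```

In a simulation, each player would call `regret_test_period` independently, with `payoff_fn` closing over whatever the other players happen to be doing; no player needs access to anyone else's state.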


Similar resources

Regret Testing: A Simple Payoff-Based Procedure for Learning Nash Equilibrium

A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game. ...


Regret testing: learning to play Nash equilibrium without knowing you have an opponent

A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.


Continuous and Global Stability in Innovative Evolutionary Dynamics

Innovation plays a central role in the development of modern economies, as does the regret of those who have missed the opportunity to try a successful new strategy. In contrast to purely biological environments, where new strategies emerge mainly by random mutation, human societies tend to exhibit more deliberate, although possibly imperfect inventions of new strategies. In this paper, we stud...


On the convergence of no-regret learning in selfish routing

We study the repeated, non-atomic routing game, in which selfish players make a sequence of routing decisions. We consider a model in which players use regret-minimizing algorithms as the learning mechanism, and study the resulting dynamics. We are concerned in particular with the convergence to the set of Nash equilibria of the routing game. No-regret learning algorithms are known to guarantee...


Unifying Convergence and No-Regret in Multiagent Learning

We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-...



Journal title:
  • Games and Economic Behavior

Volume 60  Issue 

Pages  -

Publication date 2007